LEARNING OBJECTIVES

After completing this supplement you should be able to:

Define reliability and state two ways of using it.

Find probability of functioning when activated, and explain the purpose of
redundancy in a system.

Find probability of functioning for a given length of time, and define failure
rate per hour, mean time to failure, and availability.
reliability: The ability of a product or part to perform its intended function under a
prescribed set of conditions.


INTRODUCTION

Reliability is a measure of the ability of a product or part to perform its intended function 
under a prescribed set of conditions. In effect, reliability is a probability. 

Suppose that an item has a reliability of .90. This means that it has a 90 percent
probability of functioning as intended. The probability that it will fail, i.e., its failure rate, is
1 - .90 = .10, or 10 percent. Hence, it is expected that, on average, 1 out of every 10
such items will fail or, equivalently, that the item will fail, on average, once in every 10
trials. Similarly, a reliability of .985 implies 15 failures per 1,000 parts or trials.

Reliability of a product or part is used in two ways. 

1. Reliability when activated 

2. Reliability for a given length of time

The first of these focuses on one point in time and is often used when a product or part
must operate only once, such as a missile or an air bag in a car. The second focuses on
the length of service and applies to most other products, e.g., a car. The distinction
will become more apparent as each of these approaches is described in more detail.

Reliability is an important dimension of product quality. Reliability management
involves establishing, achieving, and maintaining reliability objectives for products, e.g.,
the expected life of a particular make of light bulb may be specified to be 5,000 hours.
Achieving reliability usually falls on the shoulders of reliability engineers, who use a
variety of techniques to build reliability into products (e.g., by using reliable key
components), test their performance, and estimate their reliability. If the reliability is inadequate,
the types of failure and their effect on the product should be determined, their root
cause(s) identified, and potential failures prevented.

We will mainly focus on reliability measurement, which involves statistics and
probability theory. The average reliability of a part is measured by testing several units over
time until some or all fail. However, this time may be very long (several years). To
accelerate testing, the items are stressed by using extreme environmental conditions such as high
temperature, temperature cycles (e.g., hot-cold), high humidity, high vibration, high
voltage, surges in power, etc. The resulting life estimate is then adjusted appropriately.
Reliability of a product is determined from the reliability of its parts.



FINDING PROBABILITY OF FUNCTIONING 
WHEN ACTIVATED 


The probability that a part or product will operate as planned is an important concept in
product design. Determining that probability when the product consists of a number of
independent components requires the use of rules of probability for independent events.
Independent events have no relation to the occurrence or nonoccurrence of each other. What follows
are examples illustrating the use of two probability rules to determine whether a given product
will operate successfully. Let \(P_i\) = probability that event \(i\) occurs, \(i = 1, 2, 3, \ldots\)


Rule 1. If two or more events are independent and "success" is defined as the occurrence
of all of the events, then the probability of success \(P_s\) is equal to the product of the
probabilities of the events occurring, i.e., \(P_s = P_1 \times P_2 \times \cdots\)

Example. Suppose a room has two lamps, but to have adequate light both lamps must
work (success) when turned on. Here the product is the lighting system that has two
component lamps. One lamp has a probability of working of .90, and the other has a
probability of working of .80. The probability that both will work is .90 x .80 = .72.

This lighting system can be represented by the following diagram where the two
components are connected in series:


[Diagram: Lamp 1 and Lamp 2 connected in series]


Even though the individual components of a series system (product) might have high
reliability, the series system (product) as a whole can have considerably less reliability
because all its components must function (i.e., the system is dependent on each of its
components). As the number of components in a series system (product) increases, the
system (product) reliability decreases. For example, a series system (product) that has
eight components, each with a reliability of .99, has a reliability of only \(.99^8 = .923\). See
Figure 4S-1 for plots of product reliability as a function of the number of its components for
selected component reliability, CR.
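Rule 1 is easy to check numerically. The following is a minimal sketch, not part of the original text; the function name `series_reliability` is an assumption made for illustration. It reproduces the two-lamp example and the eight-component example above.

```python
from math import prod

def series_reliability(component_reliabilities):
    """Rule 1: a series system works only if every independent component works."""
    return prod(component_reliabilities)

# Two lamps that must both light: .90 x .80
two_lamps = series_reliability([0.90, 0.80])

# Eight components in series, each with reliability .99
eight_parts = series_reliability([0.99] * 8)

print(round(two_lamps, 2))    # .72, as in the lamp example
print(round(eight_parts, 3))  # .923, as in the text

# Product reliability falls as more components are added in series.
for n in (1, 5, 10, 20):
    print(n, round(series_reliability([0.99] * n), 3))
```

The loop at the end mirrors the idea behind Figure 4S-1: with component reliability held fixed, every additional series component lowers the reliability of the product.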

Many products have a large number of component parts that must all operate, so
some way to increase overall reliability is needed. One approach is to overdesign, i.e.,
enhance the design to avoid a particular type of failure, for example by using a more durable
and higher quality (but more expensive) material in a product. Another is design
simplification, i.e., reducing the number of components in the product. The third approach is to
use redundancy in the design. This involves providing backup components.

Rule 2. If two or more events are independent and "success" is defined as occurrence of
at least one of the events, then the probability of success \(P_s\) is equal to 1 - probability that
none of the events will occur, i.e.,

\[ P_s = 1 - (1 - P_1)(1 - P_2)(1 - P_3)\cdots \]

Expanding, this is equivalent to

\[ P_s = P_1 + (1 - P_1)P_2 + (1 - P_1)(1 - P_2)P_3 + \cdots \]

Example. There are two lamps in a room. When turned on, one has probability of
working of .90 and the other has probability of working of .80. Only a single lamp is needed to
light the room for success (note that the threshold for success is different in this example).
Then, probability of success \(P_s\) = 1 - (1 - .90)(1 - .80) = .98.

Conceptually, we can think of this system as a lamp with a backup. If the first lamp
fails to light when turned on, the backup lamp is turned on. The probability of success \(P_s\)
is the probability that the first lamp operates plus the probability that the first lamp fails and the
backup lamp operates, i.e., .90 + (1 - .90) x .80 = .98.

This backup system can be represented by the following diagram. 



Example. Three lamps have probabilities of .90, .80, and .70 of lighting when turned on.
Only one lighted lamp is needed for success. Then, probability of success \(P_s\) = 1 -
(1 - .90)(1 - .80)(1 - .70) = .994.


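Figure 4S-1: Relating product and component reliabilities (product reliability plotted against
the number of components in the product, for CR = .9, .98, and .99).

redundancy: The use of backup components to increase reliability.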


















Conceptually, we can think of this system as a lamp with a backup which in turn has a
backup. If the first lamp fails to light when turned on, the second lamp is turned on. If the
second lamp also fails to light when turned on, the third lamp is turned on. The
probability of success \(P_s\) is the probability that the first lamp operates plus the probability that the first
lamp fails and the second lamp operates plus the probability that the first and second lamps
fail and the third lamp operates, i.e.:

[#1 operates] + [#1 fails and #2 operates] + [#1 fails and #2 fails and #3 operates]

.90 + [(1 - .90) x .80] + [(1 - .90) x (1 - .80) x .70] = .994
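The two forms of Rule 2 (the complement form and the "backup chain" sum) can be compared directly. The sketch below is illustrative only; both function names are assumptions, and the components are assumed independent, as in the text.

```python
from math import prod

def parallel_reliability(reliabilities):
    """Rule 2, complement form: 1 - P(every component fails)."""
    return 1 - prod(1 - r for r in reliabilities)

def backup_chain_reliability(reliabilities):
    """Rule 2, expanded form: each backup is used only if all earlier units failed."""
    total = 0.0
    all_earlier_failed = 1.0  # probability that every earlier unit has failed
    for r in reliabilities:
        total += all_earlier_failed * r
        all_earlier_failed *= 1 - r
    return total

two_lamps = [0.90, 0.80]
three_lamps = [0.90, 0.80, 0.70]

print(round(parallel_reliability(two_lamps), 3))        # .98
print(round(parallel_reliability(three_lamps), 3))      # .994
print(round(backup_chain_reliability(three_lamps), 3))  # .994, same result
```

Both functions return the same value for any list of reliabilities, which is why the lamp examples can be solved either way.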

This double backup system can be represented by the following diagram: 



In general, a product (system) may be composed of some parallel components and 
some series components. The product's reliability is calculated in two stages: (a) first 
calculate the reliability of the parallel component(s) and then (b) use these to calculate 
the reliability of the resulting series system. 



Solution 


(a) The system can be reduced to a series of three components: 



(b) The system reliability is, then, the product of these component reliabilities: 
.98 x .99 x .996 = .966 
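The two-stage calculation can be sketched in code. The component layout below is hypothetical: the values .90/.80 and .92/.95 for the parallel pairs are assumptions chosen only because they reduce to the stage reliabilities .98 and .996 used in the solution above; the original system diagram is not reproduced here.

```python
from math import prod

def parallel(reliabilities):
    """Reliability of components in parallel (at least one must work)."""
    return 1 - prod(1 - r for r in reliabilities)

def series(reliabilities):
    """Reliability of components in series (all must work)."""
    return prod(reliabilities)

# Stage (a): reduce each parallel group to a single equivalent component.
# The individual values are hypothetical, picked to give .98, .99, and .996.
stage1 = parallel([0.90, 0.80])   # .98
stage2 = 0.99                     # a single component, no backup
stage3 = parallel([0.92, 0.95])   # .996

# Stage (b): treat the reduced system as a simple series system.
system = series([stage1, stage2, stage3])

print(round(stage1, 3), round(stage3, 3))
print(round(system, 3))  # .966
```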



FINDING PROBABILITY OF FUNCTIONING 
FOR A GIVEN LENGTH OF TIME 


The second way of looking at reliability considers a use factor, usually the time
dimension: probabilities are determined relative to a specified length of time. This approach is
most common. Product warranties, e.g., one year free repair, should be based on this
definition of reliability.

In this case, failure rate per hour is defined as the number of failures divided by total 
operating hours. 
Example S-2

Two hundred units of a particular component were subjected to accelerated life testing
equivalent to 2,500 hours of normal use. One unit failed after 1,000 hours and another
after 2,000 hours. All other units were still working at the conclusion of the test.

The failure rate per hour = 2/[198(2,500) + 1,000 + 2,000] = 2/498,000 = 0.000004016 per hour

Note that this formula assumes constant failure rate over time.

If failure rate is constant over time, time-quantity transposition is applicable, i.e., one
can reduce the test time but use more test items. For example, in the above example,
instead of 200 items tested for 2,500 hours, the reliability engineer could have used 400
items tested for 1,250 hours.

A typical profile of failure rate over time is shown in Figure 4S-2. Because of its
shape, it is referred to as the bathtub curve. Usually, a number of products or parts fail
shortly after they are put into service, because they are defective to begin with (the burn-in
phase). Examples include electronics components such as capacitors. The rate of failure decreases
rapidly as the defective items are weeded out. During the second phase, random failures
occur. In many cases, this phase covers a relatively long period of time (several years).
In the third phase, failures occur because the items are worn out, and the failure rate
increases (the wear-out phase).

Figure 4S-2: Failure rate is generally a function of time and follows the bathtub curve.

The following example illustrates the bathtub curve. In a reliability testing study, 1,000
light bulbs were lighted until they failed. Each failure time was recorded. The number of
light bulbs remaining (survivors) over time can be seen in Figure 4S-3. Note that initially
and close to the end, there are sharp drops in the number of light bulbs remaining
(survivors), reflecting the burn-in and wear-out phases, respectively.

Figure 4S-3: Number of light bulbs remaining over time.
mean time to failure (MTTF): The average length of time before failure of a product or
component.

mean time between failures (MTBF): The average time from the up time after the
repair following a failure to the next failure.


The inverse of failure rate per hour is mean time to failure (MTTF), the average length
of time (in hours) before failure. For data in Example S-2,

MTTF = 1/Failure rate per hour = 1/0.000004016 = 249,000 hours

Note that this formula assumes that failure rate is constant.
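The failure rate and MTTF calculations for the accelerated life test data can be sketched as follows. The function name and its parameters are assumptions made for illustration, and a constant failure rate is assumed, as in the text.

```python
def failure_rate_per_hour(failure_times, survivors, test_hours):
    """Number of failures divided by total operating hours of all units tested."""
    total_operating_hours = sum(failure_times) + survivors * test_hours
    return len(failure_times) / total_operating_hours

# 200 units tested for 2,500 hours; one failed at 1,000 hours, another at 2,000 hours.
rate = failure_rate_per_hour([1000, 2000], survivors=198, test_hours=2500)
mttf = 1 / rate  # valid only under the constant failure rate assumption

print(f"{rate:.9f}")  # 0.000004016 failures per hour
print(round(mttf))    # 249000 hours
```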

For repairable items, a similar term, mean time between failures (MTBF), is usually
used. MTBF is the average time from the up time after the repair following a failure to the
next failure.
The time to failure of a non-repairable item during the steady state phase can often
be modelled by the Exponential distribution with an average equal to the MTTF (see 
Figure 4S-4). Similar results hold for repairable items. The probability that the item put into 
service at time 0 will fail before some specified time, T, is equal to the area under the curve 
between 0 and T. Reliability of the item is the probability that it will last at least until time 
T; therefore, reliability is equal to the area under the curve beyond T. (Note that the total area 
under the curve is 100 percent.) Observe that as the specified length of service increases, 
the area under the curve to the right of that point (i.e., the reliability of the item) decreases. 

The Exponential distribution is completely described using a single parameter, its
average, in this case the mean time to failure (or between failures). Using the symbol T to
represent length of service, the reliability or probability that failure will not occur before
time T (i.e., the area in the right tail) is easily determined by:

\[ \text{Reliability} = P(\text{no failure before } T) = e^{-T/\mathrm{MTTF}} \]

where

e = 2.7183

T = Length of service before failure

MTTF = Mean time to failure

The probability that failure will occur before time T is 1 minus reliability:

\[ P(\text{failure before } T) = 1 - e^{-T/\mathrm{MTTF}} \]

Selected values of \(e^{-T/\mathrm{MTTF}}\) (i.e., reliability), given values for T/MTTF, are listed in
Table 4S-1.
Table 4S-1: Values of reliability = \(e^{-T/\mathrm{MTTF}}\)

T/MTTF   Reliability     T/MTTF   Reliability     T/MTTF   Reliability
  .10      .9048          2.60      .0743          5.10      .0061
  .20      .8187          2.70      .0672          5.20      .0055
  .30      .7408          2.80      .0608          5.30      .0050
  .40      .6703          2.90      .0550          5.40      .0045
  .50      .6065          3.00      .0498          5.50      .0041
  .60      .5488          3.10      .0450          5.60      .0037
  .70      .4966          3.20      .0408          5.70      .0033
  .80      .4493          3.30      .0369          5.80      .0030
  .90      .4066          3.40      .0334          5.90      .0027
 1.00      .3679          3.50      .0302          6.00      .0025
 1.10      .3329          3.60      .0273          6.10      .0022
 1.20      .3012          3.70      .0247          6.20      .0020
 1.30      .2725          3.80      .0224          6.30      .0018
 1.40      .2466          3.90      .0202          6.40      .0017
 1.50      .2231          4.00      .0183          6.50      .0015
 1.60      .2019          4.10      .0166          6.60      .0014
 1.70      .1827          4.20      .0150          6.70      .0012
 1.80      .1653          4.30      .0136          6.80      .0011
 1.90      .1496          4.40      .0123          6.90      .0010
 2.00      .1353          4.50      .0111          7.00      .0009
 2.10      .1225          4.60      .0101
 2.20      .1108          4.70      .0091
 2.30      .1003          4.80      .0082
 2.40      .0907          4.90      .0074
 2.50      .0821          5.00      .0067





Example S-3

By means of extensive testing and data collection, a manufacturer has determined that a
particular model of its vacuum cleaners has a life that is Exponentially distributed with a
mean of four years and an insignificant burn-in phase. Find the probability that one of these
vacuum cleaners will have a life that ends:

a. After the initial four years of service.

b. Before four years of service are completed.

c. Not before six years of service.

Solution

MTTF = 4 years

a. T = 4 years:

T/MTTF = 4 years / 4 years = 1.00

From Table 4S-1, \(e^{-1.00}\) = .3679.

b. The probability of failure before T = 4 years is \(1 - e^{-1.00}\), or 1 - .3679 = .6321.

c. T = 6 years:

T/MTTF = 6 years / 4 years = 1.50

From Table 4S-1, \(e^{-1.50}\) = .2231.
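The table lookups in Example S-3 can be replaced by a direct calculation. This is a minimal sketch under the same Exponential assumption; the function name `exponential_reliability` is an assumption made for illustration.

```python
from math import exp

def exponential_reliability(t, mttf):
    """P(no failure before t) for an Exponential time to failure with mean mttf."""
    return exp(-t / mttf)

mttf_years = 4

p_survive_4 = exponential_reliability(4, mttf_years)  # part a
p_fail_before_4 = 1 - p_survive_4                     # part b
p_survive_6 = exponential_reliability(6, mttf_years)  # part c

print(round(p_survive_4, 4))      # .3679
print(round(p_fail_before_4, 4))  # .6321
print(round(p_survive_6, 4))      # .2231
```

T and MTTF only need to be in the same units (here, years), because the formula depends on the ratio T/MTTF.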
Mechanical items such as ball bearings, valves, and springs tend to have insignificant
burn-in and steady-state phases, and start to wear out right away. Item failure due to
wear-out can sometimes be modelled by a Normal distribution. Obtaining Normal probabilities
involves the use of the standard Normal table (see Appendix B, Table B). The table
provides areas under a Normal curve up to a specified point z, where z is a standardized
value calculated using the formula:

\[ z = \frac{T - \text{Mean wear-out time}}{\text{Standard deviation of wear-out time}} \]

This area is the probability that service life will not exceed some value T. To find the
reliability, subtract this probability from 1. See Figure 4S-5.

Figure 4S-5: A Normal curve (horizontal axes: z scale; years, with the mean life and T marked).

To obtain the value of T that will provide a given probability, work in reverse, i.e.,
locate the nearest probability in Appendix B, Table B, and pick up the associated z value.
Then, insert the z value in the above formula and solve for T.

Example S-4

The life of a certain ball bearing can be modelled using a Normal distribution with a
mean of six years and a standard deviation of one year. Determine each of the following:

a. The probability that a ball bearing will fail before seven years of service.

b. The probability that a ball bearing will fail after seven years of service (i.e., find its
reliability).

c. The service life that will provide a failure probability of 10 percent.

Solution

Wear-out mean = 6 years
Wear-out standard deviation = 1 year
Wear-out is Normally distributed

a. Calculate z using the above formula:

\[ z = \frac{7 - 6}{1} = +1.00 \]

Use the calculated z to obtain the required probability from Appendix B, Table B.
Thus, P(T < 7) = P(z < 1.00) = .8413.

b. Subtract the probability determined in part a from 1.00:

1.00 - .8413 = .1587

c. Use the standard Normal table in reverse, i.e., find the value of z that corresponds to
an area under the curve (starting from the left side) of .10. Thus, z = -1.28 from
Appendix B, Table B. Now, insert this in the z formula above:

\[ z = -1.28 = \frac{T - 6}{1} \]

Solving for T, we find T = 4.72 years.
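The Normal table lookups in Example S-4 can also be done with Python's standard library. This sketch uses `statistics.NormalDist` (available in Python 3.8 and later); the variable names are assumptions made for illustration.

```python
from statistics import NormalDist

# Wear-out life: Normal with mean 6 years and standard deviation 1 year.
wear_out = NormalDist(mu=6, sigma=1)

p_fail_before_7 = wear_out.cdf(7)        # part a: area to the left of T = 7
reliability_at_7 = 1 - p_fail_before_7   # part b: area to the right of T = 7
life_for_10_pct = wear_out.inv_cdf(0.10) # part c: T with 10 percent failed

print(round(p_fail_before_7, 4))   # .8413
print(round(reliability_at_7, 4))  # .1587
print(round(life_for_10_pct, 2))   # 4.72 years
```

The inverse lookup uses the exact z value rather than the rounded table value of -1.28, so results may differ from a hand calculation in the third decimal place.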


A more general distribution than Exponential is the Weibull distribution (see, e.g.,
http://en.wikipedia.org/wiki/Weibull_distribution).

The probability density function of a Weibull random variable x is

\[ f(x; \lambda, k) = \begin{cases} \dfrac{k}{\lambda}\left(\dfrac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, & x \ge 0, \\ 0, & x < 0, \end{cases} \]

where \(k > 0\) is the shape parameter, \(\lambda > 0\) is the scale parameter of the distribution, and x
represents time t. If k = 1, the failure rate is constant over time, and Weibull is identical to
the Exponential distribution. If k < 1, failure rate decreases over time, and if k > 1, failure
rate increases over time. Because of its flexibility, the Weibull distribution is commonly used
to model the time to failure during the burn-in (k < 1) and wear-out (k > 1) phases.
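The effect of the shape parameter can be illustrated numerically. The sketch below relies on two standard results for the Weibull distribution that are not stated in the text: the reliability is \(e^{-(t/\lambda)^k}\) and the failure rate is \((k/\lambda)(t/\lambda)^{k-1}\). The function names and the scale value of 4 are assumptions made for illustration.

```python
from math import exp

def weibull_reliability(t, scale, shape):
    """P(no failure before t) for a Weibull time to failure."""
    return exp(-((t / scale) ** shape))

def weibull_failure_rate(t, scale, shape):
    """Instantaneous failure rate at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

scale = 4  # with shape = 1 the scale equals the MTTF

# shape = 1: constant failure rate, identical to the Exponential model
print(weibull_failure_rate(1, scale, 1), weibull_failure_rate(10, scale, 1))
print(round(weibull_reliability(4, scale, 1), 4))  # .3679, matches e^(-1.00)

# shape < 1: failure rate decreases over time (burn-in behaviour)
print(weibull_failure_rate(1, scale, 0.5) > weibull_failure_rate(2, scale, 0.5))

# shape > 1: failure rate increases over time (wear-out behaviour)
print(weibull_failure_rate(1, scale, 2) < weibull_failure_rate(2, scale, 2))
```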

Please note that the probability rules of Section 2 above for series and parallel systems 
can also be used to determine reliability of a system for a given length of time based on 
the reliability of its components over the same length of time. 
OM in ACTION

Windchill Quality Solutions
(Formerly Relex Software)¹

Windchill Quality Solutions' reliability software assists
in designing products and parts with reduced chance
of failure, collecting reliability data, estimating failure rate and
mean time between failures, finding the cause of failure, and
many other reliability activities. A screenshot from the software shows
the bill of material of the processing unit of a computer, with
component failure rate (per million hours) provided from the
library of the software, based on quality and expected operating
conditions such as temperature.

[Screenshot: bill of material with component failure rates]



availability: The fraction of time a piece of equipment or a repairable product is expected
to be available for operation.

mean time to repair (MTTR): The average length of time to repair a failed item.


Availability 

For repairable items, the measure of importance to customers, and hence to designers, is
availability. It measures the fraction of time a piece of equipment or a repairable product
is expected to be available for operation (as opposed to being down for repair).
Availability can range from zero (never available) to 1.00 (always available). Companies that can
offer equipment with high availability have a competitive advantage over companies that
offer equipment with lower availability. Availability is a function of both the mean time
between failures and the mean time to repair (the average length of time to repair a
failed item). We assume that there is little delay before a failed item begins to be repaired.
The availability factor can be calculated using the following formula:

\[ \text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}} \]

where

MTBF = Mean time between failures
MTTR = Mean time to repair


Example S-5 


A copier is expected to operate for 200 hours after repair, and the mean repair time is 
expected to be two hours. Determine the availability of the copier. 


MTBF = 200 hours, and MTTR = 2 hours

Availability = MTBF/(MTBF + MTTR) = 200/(200 + 2) = .99


To increase availability, designers increase MTBF and also decrease MTTR. Laser
printers, for example, are designed with print cartridges that can be easily replaced, thus
requiring a small MTTR.
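The availability formula and the two design levers (longer MTBF, shorter MTTR) can be sketched as follows. The function name is an assumption, and the "improved" values of 400 hours and 1 hour are hypothetical, chosen only to show the direction of the effect.

```python
def availability(mtbf, mttr):
    """Fraction of time a repairable item is expected to be available."""
    return mtbf / (mtbf + mttr)

# Copier from Example S-5: MTBF = 200 hours, MTTR = 2 hours
copier = availability(200, 2)

# Hypothetical design changes: double the MTBF, or halve the MTTR
longer_mtbf = availability(400, 2)
shorter_mttr = availability(200, 1)

print(round(copier, 2))        # .99
print(round(longer_mtbf, 4))
print(round(shorter_mttr, 4))
```

In this formula, doubling MTBF and halving MTTR give the same availability, because only the ratio of the two quantities matters.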


Key Terms 


availability
mean time between failures (MTBF)
mean time to failure (MTTF)
mean time to repair (MTTR)
redundancy
reliability


¹http://www.ptc.com/products/windchill/quality. Image courtesy PTC-Windchill Quality Solutions (formerly
Relex Software Corporation).
Solved Problems

Problem 1

A product designer must decide if a redundant component is cost-justified in a product. The
product in question has a critical component with a probability of .98 of operating. Product failure
would involve a cost of $20,000. For a cost of $100, a switch and backup component could be
added that would automatically transfer the control to the backup component in the event of a
failure. Should the backup component be added if its operating probability is also .98?

Solution

Because no probability is given for the switch, we will assume that its probability of operating
when needed is 1.00. The expected cost of failure (i.e., without the backup) is
$20,000(1 - .98) = $400.

With the backup, the probability of not failing would be:

.98 + .02(.98) = .9996

Hence, the probability of failure would be 1 - .9996 = .0004. The expected cost of failure with
the backup would be the added cost of the backup component plus the failure cost:

$100 + $20,000(.0004) = $108

Because this ($108) is less than the expected cost without the backup ($400), adding the backup
component is definitely cost-justified.

Problem 2

Due to the extreme cost of interrupting production, a manufacturer has two standby machines
available in case a particular machine breaks down. The machine in use has a reliability of .94,
and the backups have reliabilities of .90 and .80. In the event of a failure, a backup machine is
brought into service. If this machine also fails, the other backup is used. Calculate the system
reliability.

Solution

\(R_1 = .94\), \(R_2 = .90\), and \(R_3 = .80\)

The system can be depicted in this way:

[Diagram: the machine in use with its two backup machines]

\[ R_{\text{system}} = R_1 + R_2(1 - R_1) + R_3(1 - R_2)(1 - R_1) \]

= .94 + .90(1 - .94) + .80(1 - .90)(1 - .94) = .9988

Problem 3

A hospital has three independent fire alarm systems, with reliabilities of .95, .97, and .99. In the
event of a fire, what is the probability that a warning would be given?

Solution

A warning would not be given if all three alarms failed. The probability that at least one alarm
would operate is 1 - P(none operate):

P(none operate) = (1 - .95)(1 - .97)(1 - .99) = .000015
P(warning) = 1 - .000015 = .999985

Alternatively, P(warning) = .95 + .97(1 - .95) + .99(1 - .95)(1 - .97) = .999985

Problem 4

A weather satellite has expected life of 10 years from the time it is placed into Earth's orbit.
Determine its reliability for each of the following lengths of service (assume that Exponential
distribution is appropriate).

a. 5 years

b. 12 years

c. 20 years

d. 30 years

Solution

MTTF = 10 years

Calculate the ratio T/MTTF for T = 5, 12, 20, and 30, and obtain the values of \(e^{-T/\mathrm{MTTF}}\) from
Table 4S-1. These are the solutions (= reliabilities).

       T    MTTF   T/MTTF   Reliability
a.     5     10      .50      .6065
b.    12     10     1.20      .3012
c.    20     10     2.00      .1353
d.    30     10     3.00      .0498

Problem 5

What is the probability that the satellite described in Solved Problem 4 will fail between 5 and
12 years after being placed into Earth's orbit?

Solution

P(5 years < failure < 12 years) = P(failure after 5 years) - P(failure after 12 years)

Using the probabilities shown in the previous solution, we obtain:

  P(failure after 5 years)  = .6065
- P(failure after 12 years) = .3012
                              .3053

Problem 6

One line of specialty tires has a wear-out life that can be modelled using Normal distribution
with a mean of 25,000 km and a standard deviation of 2,000 km. Determine each of the
following:

a. The percentage of tires that can be expected to wear out within ±2,000 km of the average
(i.e., between 23,000 km and 27,000 km).

b. The percentage of tires that can be expected to fail between 26,000 km and 29,000 km.

c. For what tire life would you expect 4 percent of the tires to have worn out?

Note: Kilometres are analogous to time and are handled in exactly the same way.

Solution
a. The phrase "within ±2,000 km of the average" translates to within one standard deviation of
the mean because the standard deviation equals 2,000 km. Therefore, the range of z is
z = -1.00 to z = +1.00, and the area under the curve between those points is found as the
difference between P(z < +1.00) and P(z < -1.00), using values obtained from Appendix B,
Table B.

  P(z < +1.00) = .8413
- P(z < -1.00) = .1587

P(-1.00 < z < +1.00) = .6826, which means 68.26 percent of tires will wear out between 23,000 km
and 27,000 km.

b. Wear-out mean = 25,000 km

Wear-out standard deviation = 2,000 km

\[ P(26{,}000 < \text{Wear-out} < 29{,}000) = P(z < z_{29{,}000}) - P(z < z_{26{,}000}) \]

\[ z_{29{,}000} = \frac{29{,}000 - 25{,}000}{2{,}000} = +2.00 \rightarrow .9772 \text{ (from Appendix B, Table B)} \]

\[ z_{26{,}000} = \frac{26{,}000 - 25{,}000}{2{,}000} = +.50 \rightarrow .6915 \text{ (from Appendix B, Table B)} \]

The difference is .9772 - .6915 = .2857, which means 28.57 percent of tires will wear out
between 26,000 km and 29,000 km.

c. Use Appendix B, Table B to find z for 4 percent: z = -1.75.
Find tire life using \(\mu + z\sigma\): 25,000 - 1.75(2,000) = 21,500 km.
Discussion and Review Questions


1. Define the term reliability and give an example. (L01) 

2. Explain why a product might have an overall reliability that is low even though its components 
have fairly high reliabilities. (L02) 

3. What is redundancy and how can it improve product reliability? (L02) 

4. How is failure rate per hour calculated? MTTF? Give an example of both. (L03)

5. What is the significance of the bathtub curve in reliability? Give an example of an item with 
failure rate in each phase. (L03) 

6. How is reliability determined if the distribution of time to failure is Exponential? (L03) 

7. What is availability and how can it be increased? (L03) 


Internet Exercises 


1. Visit either http://www.smrpjobboard.com or http://www.sre.org/current/current.htm, pick a 
reliability job announcement, and briefly summarize the duties involved. (L01) 

2. Read http://asq.org/certification/reliability-engineer/bok.html, and briefly summarize the 
knowledge and skills required of a Certified Reliability Engineer. (L01) 

3. Choose one of the following case studies and summarize it: (L01) 

a. http://www.ptc.com/WCMS/files/114064/en/5018-RelexOpSim-Enphase-cs-en.pdf

b. http://www.ptc.com/WCMS/files/114063/en/cs-4864-HR_Textron-Weibull.pdf

4. Visit http://www.itl.nist.gov/div898/handbook/apr/section1/apr18.htm, find the information
about the Standby System and parallel system, and briefly explain how they differ. (L03) 

5. Visit http://www.sqconline.com/resistor-mtbf-mil-hdbk-217-rev-f-notice-2, find the failure 
rate and MTBF of a resistor, and identify the relationship between the two. (L03) 



Determine the probability that the system will operate under each of these conditions: 

a. The system as shown. 

b. Each system component has a backup with a reliability of .90 and a switch that is 100 percent
reliable.

c. Each system component has a backup with .90 reliability and a switch that is 99 percent
reliable.

2. A product is composed of four parts. In order for the product to function properly, each of the
parts must function. Two of the parts each have .96 probability of functioning, and the other
two each have probability of .99. What is the overall probability that the product will function
properly? (L02)

3. A system consists of three identical components. In order for the system to perform as
intended, all of the components must perform. Each component has the same probability of
performance. If the system is to have .92 probability of performing, what probability of
performance is needed by each of the individual components? (L02)

4. A product engineer has developed the following equation for the cost of a component: \(C = (10P)^2\),
where C is the cost in dollars and P is the probability that the component will operate as
expected. The product is composed of two of these components, both of which must operate for the
product to operate. The engineer can spend a total of $173 for the two components. To the nearest
two decimal places, what is the largest component reliability that can be achieved? (L02)

5. The guidance system of a ship is controlled by a computer that has three major modules. In
order for the computer to function properly, all three modules must function. Two of the
modules have reliability of .97, and the other has reliability of .99. (L02)

a. What is the reliability of the computer?

b. A backup computer identical to the one being used can be installed to improve overall
reliability. Assuming that the new computer can automatically function if the first computer
fails, determine the resulting reliability.

c. If the backup computer must be activated by a switch in the event that the first computer
fails, and the switch has a reliability of .98, what is the overall reliability of the system?
(Both the switch and the backup computer must function in order for the backup system to
function.)

6. One of the industrial robots designed by a leading producer has four major components. Com¬ 
ponents' reliability are .98, .95, .94, and .90. All of the components must function in order for 
the robot to operate effectively. (L02) 

a. Calculate the reliability of the robot. 

b. Designers want to improve the reliability of the robot by adding a backup component. Due 
to space limitations, only one backup can be added. The backup for any component will 
have the same reliability as the unit for which it is the backup. Which component should 
get the backup in order to achieve the highest reliability of the robot? 

c. If one backup with a reliability of .92 can be added to any one of the main components, 
which component should get it to achieve the highest overall reliability? 

7. A production line has three machines A, B, and C, with reliabilities of .99, .96, and .93, respectively.
The machines are arranged so that if one breaks down, the others must shut down.
Engineers are weighing two alternative designs for increasing the line's reliability. Plan 1 involves
adding an identical backup line (i.e., a series backup), and plan 2 involves providing a backup
for each machine (i.e., a parallel backup). In either case, three additional machines (A, B, and
C) would be used with reliabilities equal to the original three. (L02)

a. Which plan will provide higher reliability?

b. Explain why the two reliabilities are not the same.

c. What other factors might enter into the decision of which plan to adopt?

8. Refer to the previous problem. (L02) 

a. Assume that a single switch is used in plan 1 to transfer production to the backup line if the
first line fails, and this switch is 98 percent reliable, while the reliabilities of the machines
remain the same. Recalculate the reliability of plan 1. Compare this reliability with the
reliability of plan 1 calculated in solving the original problem. How much did the reliability of
plan 1 decrease as a result of a 98-percent-reliable switch?

b. Assume that three switches are used in plan 2 to transfer production to the backup machines
if the original machines fail, and these switches are all 98 percent reliable, while the reliabilities
of the machines remain the same. Recalculate the reliability of plan 2. Compare the
reliability of this plan with the reliability of plan 2 calculated in solving the original problem.
How much did the reliability of plan 2 decrease?

9. A Web server has five major components that must all function in order for it to operate as 
intended. Assuming that each component of the system has the same reliability, what is 
the reliability each one must have in order for the overall system to have a reliability 
of .98? (L02) 

10. Repeat Problem 9 under the condition that one of the components will have a backup with 
reliability equal to that of any one of the other components. (L02) 
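
When every component has the same unknown reliability, the required value can be found numerically. The sketch below assumes independent components and uses bisection; the function names and the `backed_up` parameter are hypothetical helpers introduced for illustration. With no backups the answer can also be written directly as the n-th root of the target.

```python
def system_reliability(r, n=5, backed_up=0):
    """n positions in series; `backed_up` of them have an identical parallel backup."""
    paired = 1 - (1 - r) ** 2          # a component together with its backup
    return r ** (n - backed_up) * paired ** backed_up


def required_component_reliability(target, n=5, backed_up=0, tol=1e-10):
    """Smallest common component reliability that reaches the target (bisection)."""
    lo, hi = 0.0, 1.0                  # system reliability increases with r on [0, 1]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if system_reliability(mid, n, backed_up) < target:
            lo = mid
        else:
            hi = mid
    return hi


no_backup = required_component_reliability(0.98, n=5)
one_backup = required_component_reliability(0.98, n=5, backed_up=1)
print(round(no_backup, 4), round(one_backup, 4))
```

Adding a backup to one position lowers the reliability that each individual component must have.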

11. Hoping to increase the chances of reaching a performance goal, the director of a research 
project has assigned the same task to three separate research teams. The director estimates that 
the team probabilities for successfully completing the task in the allotted time are .9, .8, and .7. 
Assuming that the teams work independently, what is the probability that the project will not 
be completed in time? (L02) 

12. An electronic chess game has a useful life that is Exponential with a mean of 30 months. 
Determine each of the following: (L03) 

a. The probability that any given unit will operate for at least (1) 39 months, (2) 48 months, 
(3) 60 months. 

b. The probability that any given unit will fail sooner than (1) 33 months, (2) 15 months, (3) 
6 months. 

c. The length of service time after which the percentage of failed units will approximately 
equal (1) 50 percent, (2) 85 percent, (3) 95 percent, (4) 99 percent. 
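
For an Exponential life with a given mean (MTTF), the probability of surviving past time T is e^(-T/MTTF), and the time by which a fraction p of units has failed is -MTTF ln(1 - p). A minimal sketch, with hypothetical function names and an illustrative mean of 10 time units:

```python
from math import exp, log


def survival(t, mttf):
    """Probability that a unit is still working at time t."""
    return exp(-t / mttf)


def failure_before(t, mttf):
    """Probability that a unit fails before time t."""
    return 1 - survival(t, mttf)


def time_until_fraction_failed(p, mttf):
    """Service time after which a fraction p of units is expected to have failed."""
    return -mttf * log(1 - p)


# Illustrative mean life of 10 time units (not taken from the problems).
print(round(survival(10, 10), 4))                    # surviving to the mean life
print(round(failure_before(5, 10), 4))               # failing in the first half of it
print(round(time_until_fraction_failed(0.5, 10), 2)) # median life
```

Note that only about 37 percent of units survive to the mean life, so the median life is shorter than the mean.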

13. A manufacturer of programmable calculators is attempting to determine a reasonable warranty 
period for a model it will introduce shortly. The manager of product testing has indicated that 
the calculators have an expected life of 30 months. Assume product life can be described by
an Exponential distribution. (L03)

a. If warranties are offered for the expected life of the calculators, what percentage of those
sold would be expected to fail during the warranty period?

b. What warranty period would result in a failure chance of approximately 10 percent?

14. A type of light bulb has a life that is Exponentially distributed with a mean of 5,000 hours. 
Determine the probability that one of these light bulbs will last: (L03) 

a. At least 6,000 hours. 

b. No longer than 1,000 hours. 

c. Between 1,000 hours and 6,000 hours. 

15. According to its designers, a satellite will have an expected life of six years. Assume that an
Exponential distribution applies. Determine the probability that it will function for each of the
following time periods: (L03)

a. More than 9 years.

b. Less than 12 years.

c. More than 9 years but less than 12 years.

d. At least 21 years. 

16. An office manager has received a report from a consultant on equipment replacement. The 
report indicates that the scanners have a service life that is Normally distributed with mean of 
41 months and standard deviation of 4 months. On the basis of this information, determine the 
percentage of scanners that can be expected to fail in the following time periods: (L03) 

a. Before 38 months of service 

b. Between 40 and 45 months of service 

c. Within 2 months of the mean life 

17. A copier manufacturer has determined that its major product has a service life that can be
modelled by a Normal distribution with a mean of six years and a standard deviation of half a year.
(L03)

a. What probability can you assign to service lives of (1) at least five years? (2) at least six 
years? (3) at most seven and a half years? 

b. If the manufacturer offers service warranty of four years on these copiers, what percentage 
can be expected to fail during the warranty period? 

18. Refer to Problem 17. What warranty period would result in percentage failure of: (L03) 

a. 2 percent? 

b. 5 percent? 
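
For Normally distributed service lives, the table lookups can be cross-checked with Python's standard library. The sketch below uses `statistics.NormalDist`; the two helper names and the numbers (mean 10 years, standard deviation 2 years) are illustrative assumptions, not values from the problems.

```python
from statistics import NormalDist


def fraction_failing_between(mean, sd, lower, upper):
    """Share of units whose life falls between `lower` and `upper`."""
    life = NormalDist(mu=mean, sigma=sd)
    return life.cdf(upper) - life.cdf(lower)


def warranty_for_failure_fraction(mean, sd, fraction):
    """Warranty length by which the given fraction of units is expected to fail."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(fraction)


# Illustrative values only: mean life 10 years, standard deviation 2 years.
print(round(fraction_failing_between(10, 2, 8, 12), 4))    # within one sd of the mean
print(round(warranty_for_failure_fraction(10, 2, 0.10), 2))  # 10 percent fail by then
```

The first helper answers "what percentage fails in this period" questions, and the second inverts the calculation for "what warranty period" questions.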

19. Determine the availability for each of these cases: (L03) 

a. MTBF = 40 days, MTTR = 53 days

b. MTBF = 300 hours, MTTR = 6 hours 

20. A machine can operate for an average of 50 days before it needs to be overhauled, a process 
that takes two days. Calculate the availability of this machine. (L03) 

21. A manager must decide between two machines. Machine A has an average operating time of
142 hours and an average repair time of 7 hours. Times for machine B are an average operating
time of 65 hours and an average repair time of 2 hours. What is the availability of each
machine? (L03)

22. A designer estimates that she can (a) increase the average time to failure of a product by 5
percent at a cost of $450, or (b) reduce the average repair time by 10 percent at a cost of $200.
Which option would be more cost-effective? Currently, the average time to failure is 100 hours
and the average repair time is 4 hours. (L03)
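
Problems 19 to 22 all rest on the same ratio, availability = MTBF / (MTBF + MTTR). A minimal sketch follows; the function name and the numbers are illustrative only.

```python
def availability(mtbf, mttr):
    """Fraction of time the item is operable: uptime over uptime plus repair time."""
    return mtbf / (mtbf + mttr)


# Illustrative values only: 90 hours between failures, 10 hours to repair.
base = availability(90, 10)
longer_life = availability(90 * 1.10, 10)    # time between failures up 10 percent
faster_repair = availability(90, 10 * 0.90)  # repair time down 10 percent
print(round(base, 4), round(longer_life, 4), round(faster_repair, 4))
```

Comparing the gain in availability from each change with its cost gives a simple basis for a cost-effectiveness comparison.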

23. A battery's life is Normally distributed with mean of 4.7 years and standard deviation of .3 
year. The batteries are warranted to operate for a minimum of four years. If a battery fails 
within the warranty period, it will be replaced with a new battery at no charge. (L03) 

a. What percentage of batteries would you expect to fail before the warranty period 
expires? 

b. The manager is toying with the idea of using the same battery with a different exterior, 
labelling it as a premium battery, and offering a 54-month warranty on it. What percentage 
of "premium" batteries would you expect to fail before the warranty period expires? 

24. In practice, for a series system the failure rate is estimated by adding the failure rate of its 
components. For a system made of n identical components in series, each having a probability
of failure P_f, the probability of system failure is approximately n(P_f), provided that P_f is
sufficiently small. Choose a value of n > 1 and P_f < .05, and show the above result. (L02)
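
The comparison asked for here is between the exact series failure probability, 1 - (1 - P_f)^n, and the approximation n(P_f). A short sketch with illustrative values of n and P_f (any values that meet the stated conditions would do):

```python
def series_failure_probability(n, p_f):
    """Exact probability that at least one of n independent components fails."""
    return 1 - (1 - p_f) ** n


# Illustrative choice only: n = 4 components, each with P_f = .01.
n, p_f = 4, 0.01
exact = series_failure_probability(n, p_f)
approx = n * p_f
print(round(exact, 6), round(approx, 6), round(approx - exact, 6))
```

The approximation slightly overstates the failure probability, and the gap shrinks as P_f gets smaller.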

25. The MTTF of the central processing unit (CPU) of a single-board computer is estimated to
be 150,000 hours. You can assume an Exponential distribution for the operating time of this
component until failure. What is the probability that this component will operate without failure
for: (L03)

a. 2.5 years? 

b. 5 years? 

c. 10 years? 

(Hint: Use a calculator, instead of Table 4S-1, to obtain more accurate probabilities.) 

*26. A study was performed to determine the reliability of components of personal computers (PCs)
used by Rolls-Royce staff.² The operating lives of the components of 341 PCs were measured
over a 22-month period. The study fit the Weibull distribution to the sample operating lives of
each component and, based on the estimated parameters k (shape) and λ (scale), calculated its
average life or MTTF. (L03)

$$
f(x;\lambda,k) =
\begin{cases}
\dfrac{k}{\lambda}\left(\dfrac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, & x \ge 0,\\[4pt]
0, & x < 0.
\end{cases}
$$

In the above formula, x represents time. The mean of the Weibull distribution is λΓ(1 + 1/k),
where Γ(y) can be computed in Excel using "=EXP(GAMMALN(y))". While some components
had a larger failure rate at the beginning of their life (e.g., hard disks with k = .51), others
had an almost constant failure rate over time (e.g., motherboards with k = .99). Mice had
k = .86, λ = 22,440, and average life of 24,000 hours; keyboards had k = .76, λ = 41,919, and
average life of 49,000 hours; hard disks had k = .51, λ = 136,752, and average life of
264,000 hours; and monitors had k = .76, λ = 58,395, and average life of 69,000 hours.

a. For motherboards, k = .99 and λ = 49,171. Estimate the average life of a motherboard.

b. Suppose 90 motherboards (out of 341) failed during the study period (22 months), and their 
lives added to 540,000 hours. Estimate the average life of a motherboard (in hours) directly 
(i.e., not using Weibull distribution). 
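
The Weibull mean, scale times Γ(1 + 1/shape), can also be evaluated outside Excel. The sketch below uses Python's `math.gamma` and checks the function against the average lives reported in the problem for mice, keyboards, hard disks, and monitors; the function name `weibull_mean` is a hypothetical helper.

```python
from math import gamma


def weibull_mean(scale, shape):
    """Mean of a Weibull distribution: scale * Gamma(1 + 1/shape)."""
    return scale * gamma(1 + 1 / shape)


# (scale, shape, reported average life in hours), as quoted in the problem.
published = {
    "mice": (22_440, 0.86, 24_000),
    "keyboards": (41_919, 0.76, 49_000),
    "hard disks": (136_752, 0.51, 264_000),
    "monitors": (58_395, 0.76, 69_000),
}

for name, (scale, shape, reported) in published.items():
    computed = weibull_mean(scale, shape)
    print(f"{name}: computed {computed:,.0f} h, reported {reported:,} h")
```

The computed means agree with the reported figures to within rounding, which suggests the reported average lives were rounded to the nearest thousand hours.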

27. A car has four independent and identical tires. The reliability of a tire is .99. If any tire is flat, 
the car cannot be driven. Calculate the reliability of a car with respect to its tires. (LO2) 

28. A computer has two independent and identical Central Processing Units (CPUs). The
computer will operate if at least one CPU operates. The reliability of a CPU is .99. Calculate the
reliability of the computer with respect to its CPUs. (L02)

*29. A communication network between two cities, A and B, consists of five independent and
identical relay units forming a bridge network as shown below. For the network to work, at least one
path between the two cities should work. If the reliability of each relay is .99, what is the
reliability of the network, R_N? Hint: R_N = 1 - the sum of the probabilities that each minimal set
of two or three relays that together cut off the network will fail simultaneously. Specifically, each
of the relay sets 1&4, 2&5, 1&3&5, and 2&3&4 cuts off the two cities. (LO2)


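[Figure: bridge network between City A and City B. Relays 1 and 2 form the upper path, relays 4 and 5 form the lower path, and relay 3 bridges the two paths.]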
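
The cut-set hint can be compared with an exact calculation by enumerating all 32 working/failed combinations of the five relays. The sketch below assumes the bridge layout implied by the cut sets in the hint, that is, paths 1-2, 4-5, 1-3-5, and 4-3-2; that layout, the function names, and the reliability of .90 used in the usage lines are assumptions made for illustration.

```python
from itertools import product

# Assumed minimal paths from City A to City B (consistent with the cut sets in the hint).
PATHS = [{1, 2}, {4, 5}, {1, 3, 5}, {4, 3, 2}]


def bridge_reliability(r):
    """Exact network reliability: sum the probability of every state with a working path."""
    total = 0.0
    for states in product([True, False], repeat=5):
        working = {i + 1 for i, up in enumerate(states) if up}
        if any(path <= working for path in PATHS):
            k = len(working)
            total += r ** k * (1 - r) ** (5 - k)
    return total


def cut_set_approximation(r):
    """Hint formula: 1 minus the summed failure probabilities of the four minimal cut sets."""
    q = 1 - r
    return 1 - (2 * q ** 2 + 2 * q ** 3)


# Illustrative relay reliability of .90 (not the value given in the problem).
print(round(bridge_reliability(0.90), 5), round(cut_set_approximation(0.90), 5))
```

The two figures differ slightly because the cut-set sum counts states in which more than one cut set fails more than once; the difference becomes negligible as relay reliability approaches 1.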


2 M. Bradley and R. Dawson, "The Cost of Unreliability: A Case Study," Journal of Quality in Maintenance
Engineering, 4(3), 1998, pp. 212-218.

30. A plane has two independent and identical engines. At least one engine should operate for the
plane not to crash. The reliability of an engine is .999. Calculate the reliability of the plane
with respect to its engines. (L02)

31. Suppose that the failure rate of a tire is .0001 per hour of driving (with time to failure being
Exponential), and that the failure rates of the four tires are independent and identical. (L03)

a. What is the failure rate of the car with respect to its tires? Hint: It is the sum of the failure
rates of the four tires because any tire's failure would result in the car's failure.

b. Calculate the mean time to a tire failure for the car. 

c. Calculate the reliability of the car with respect to its tires if a 25-hour journey is going to 
take place. 

Hint: The distribution of time to failure for the car is also Exponential. 

*32. A computer has two independent and identical Central Processing Units (CPUs) that are both 
used when the computer is on. Suppose that the failure rate of a CPU is .0001 failures per hour 
(with time to failure being Exponential), and the computer will operate if at least one CPU 
operates. Calculate the mean time to CPU failure of the computer. Hint: It can be shown that 

$$\mathrm{MTTF}_P = \frac{1}{\lambda}\sum_{i=1}^{m}\frac{1}{i}$$

where λ is the failure rate of one CPU and m is the number of parallel CPUs (in this case m = 2).
(L03)

*33. A standby system consists of two independent and identical units. When the first unit fails, the 
second unit (standby) kicks in. Note that unlike the parallel system of the previous problem, 
the standby component is brought into operation only when needed. A unit's failure rate is 
.0001 failures per hour (with time to failure being Exponential). Calculate the system mean 
time to failure (i.e., both units failing). Hint: It can be shown that 


$$\mathrm{MTTF} = \frac{m}{\lambda}$$

where λ is the failure rate of a unit and m is the number of units (in this case m = 2). (L03)
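
The hints in the last two problems can be compared side by side. The sketch below reads them as MTTF = (1/λ)(1 + 1/2 + ... + 1/m) for m active parallel units and MTTF = m/λ for m standby units; the function names, the failure rate of .002 per hour, and m = 3 are illustrative assumptions.

```python
def mttf_parallel(rate, m):
    """m identical units all running; the system fails when the last one fails."""
    return sum(1 / i for i in range(1, m + 1)) / rate


def mttf_standby(rate, m):
    """m identical units used one after another; each starts only when needed."""
    return m / rate


# Illustrative values only: failure rate .002 per hour, three units.
rate, m = 0.002, 3
print(round(mttf_parallel(rate, m), 1), round(mttf_standby(rate, m), 1))
```

A standby arrangement gives the longer mean time to failure because the spare units do not accumulate operating time while they wait (assuming a perfect changeover, as both hints do).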

*34. The following data show the times to failure of 100 units of a generator that were run until
failure.³ (L03)

a. Compute the failure rate during each month as a proportion of numbers that survived until 
the beginning of that month. Draw these failure rates against months. Determine if the 
failure rate is decreasing, constant, or increasing. 

b. If we assume that failure rate is constant, determine the average failure rate (per month) 
and MTTF (in months). 


Month        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
No. failed   1   0   2   1   4  10  18  22  14   8  10   5   3   1   0   1
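
The per-month failure rate described in part a is the number failed in a month divided by the number still working at the start of that month. A minimal sketch of that bookkeeping, using the counts from the table above (the function name is a hypothetical helper):

```python
def interval_failure_rates(failures, n_units):
    """Failures in each interval as a proportion of units surviving to its start."""
    rates = []
    survivors = n_units
    for failed in failures:
        rates.append(failed / survivors if survivors else float("nan"))
        survivors -= failed
    return rates


monthly_failures = [1, 0, 2, 1, 4, 10, 18, 22, 14, 8, 10, 5, 3, 1, 0, 1]
rates = interval_failure_rates(monthly_failures, n_units=100)
for month, rate in enumerate(rates, start=1):
    print(f"month {month:2d}: {rate:.3f}")
```

Plotting these proportions against the month number shows whether the failure rate is decreasing, roughly constant, or increasing.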

35. Five units of an electronic component were subjected to stress testing until they failed.⁴ The
observed failure times (in days) were 550, 680, 820, 910, and 1,110. Calculate the MTTF and
the failure rate for this component. (L03)

36. Twenty units of a component with constant failure rate were subjected to high-stress testing.
After 25 hours, seven had failed, at times (in hours) 2.1, 8.3, 10.9, 15.2, 16.3, 20.5, and 23.8.⁵ Calculate
the MTTF and the failure rate for this component. (L03)
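
When a test is stopped before every unit has failed, a common estimate under a constant failure rate divides the total operating time of all units, failed or not, by the number of failures. The sketch below assumes that estimator; the function name and the tiny data set are illustrative and are not the data from the problems above.

```python
def estimate_mttf(failure_times, n_units, test_length):
    """Total unit operating time divided by the number of observed failures.

    Units that have not failed by the end of the test contribute the full test length.
    """
    survivors = n_units - len(failure_times)
    unit_time = sum(failure_times) + survivors * test_length
    return unit_time / len(failure_times)


# Illustrative data only: 3 units tested for 30 hours, failures at 10 and 20 hours.
mttf = estimate_mttf([10, 20], n_units=3, test_length=30)
failure_rate = 1 / mttf
print(mttf, round(failure_rate, 4))
```

If every unit fails during the test, the estimate reduces to the ordinary average of the failure times.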

*37. The following data show the number of failures of 1,000 units of an electronic component in
time intervals of 100 hours.⁶ (L03)

a. Compute the failure rate during each time interval as a proportion of numbers that survived 
until the beginning of that interval. What can you conclude about the failure rates? 

b. Calculate the MTTF for this component. 


3 R. D. Leitch, Basic Reliability Engineering Analysis, London: Butterworths, 1988, p. 21. 

4 R. D. Leitch, Basic Reliability Engineering Analysis, London: Butterworths, 1988, p. 33.

5 R. D. Leitch, Basic Reliability Engineering Analysis, London: Butterworths, 1988, p. 34.

6 D. J. Klinger et al., Editors, AT&T Reliability Manual, New York: Van Nostrand Reinhold, 1990, p. 7.

Time Interval    No. of Failures
0-100            95
100-200          86
200-300          78
300-400          70
400-500          64
500-600          58
600-700          52
700-800          47
800-900          42
900-1000         39


38. A piece of equipment contains 300 integrated circuits, 25 amplifiers, 150 transistors, 500 resistors,
and 450 capacitors.⁷ Failure of any component causes a system failure. Using the estimated
failure (hazard) rates below, calculate the failure (hazard) rate of the equipment. (L03)

Component             Hazard rate (failures per billion hours)
Integrated circuit    10
Amplifier             30
Transistor            20
Resistor              1
Capacitor             1

MINI-CASE 


Engineer Tank⁸

An engineer tank is an armoured vehicle used on the battlefield
to perform engineering tasks such as opening routes, digging,
and bulldozing. During design, the mean time between
the need for unscheduled maintenance (other than loss of
propulsion) was set to 420 hours. (The target for MTBF for loss
of propulsion is 3,500 hours.) During reliability testing of 70
engineer tanks, the need for unscheduled maintenance arose
after the following number of hours: (L03)



No.  Failed hr   No.  Failed hr   No.  Failed hr   No.  Failed hr   No.  Failed hr   No.  Failed hr   No.  Failed hr
  1          9    11         52    21        105    31        178    41        232    51        284    61        416
  2         18    12         64    22        114    32        194    42        232    52        291    62        425
  3         22    13         68    23        122    33        195    43        239    53        292    63        441
  4         23    14         75    24        126    34        198    44        241    54        294    64        451
  5         27    15         88    25        126    35        202    45        244    55        315    65        454
  6         33    16         89    26        130    36        210    46        247    56        323    66        456
  7         36    17         91    27        151    37        215    47        247    57        325    67        557
  8         42    18         94    28        153    38        216    48        252    58        350    68        560
  9         46    19         95    29        165    39        223    49        273    59        360    69        947
 10         50    20         99    30        168    40        223    50        275    60        378    70        997


a. Make a histogram of these times. Does the distribution look Exponential?

b. Compute the mean. Is the target MTBF met? 


7 D. J. Klinger et al., Editors, AT&T Reliability Manual, New York: Van Nostrand Reinhold, 1990, p. 94.

8 U. D. Kumar et al., Reliability and Six Sigma, New York: Springer, 2006, p. 102.

MINI-CASE


Sonar System⁹


The components of an underwater sonar system are given in the block diagram below:



The failure rate of one unit of each component is estimated below using available data tables: 


Subsystem                       Failure rate λ (per 10^6 hours)
Transducer                      1,000
Power Supply                    20
Heat Exchanger                  20
Transmitter/Receiver Switch     6
Power Amplifier                 120
Pre Amplifier                   32
Front-end Processor             400
System Controller (parallel)    10
SDLC Bus (parallel)             15
Signal Processor                450
Display Processor (parallel)    150
Display Monitor (parallel)      50
Audio Processor                 20


Calculate the reliability of the sonar system, that is, the probability that it works failure-free for 100 hours.
Hint: For two parallel components, the overall failure rate of the pair is the failure rate of one divided
by 1.5 (see Problem 32 above).


9 U. D. Kumar et al., Reliability and Six Sigma, New York: Springer, 2006, pp. 139-141.
